A structured speech model parameterized by recursive dynamics and neural networks
Abstract
This paper presents an overview of the Hidden Dynamic Model (HDM) paradigm, which exemplifies the parametric construction of structure-based speech models for recognition. We explore a general class of HDM that uses recursive, autoregressive functions to represent the hidden speech dynamics and neural networks to represent the functional relationship between the hidden and observed speech vectors. This state-space formulation of the HDM is reviewed in terms of model construction, a parameter estimation technique, and a decoding method. We also present typical experimental results on the use of this type of HDM for phonetic recognition and for automatic vocal tract resonance tracking. We further analyze the computational complexity (for decoding) and the parameter size of the HDM in comparison with the hidden Markov model (HMM). Finally, we discuss several key issues for future exploration of the HDM paradigm.
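The state-space structure described in the abstract can be illustrated with a minimal sketch. The abstract gives no equations, so everything specific here is an assumption made for illustration: the first-order, target-directed recursion for the hidden state, the one-hidden-layer network for the observation mapping, and all names (`simulate_hdm`, `phi`, `target`, `W1`, `b1`, `W2`, `b2`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def simulate_hdm(phi, target, W1, b1, W2, b2, n_frames, z0, noise_std=0.0, seed=0):
    """Simulate a toy hidden dynamic model (illustrative assumptions only).

    State equation (assumed form): z_t = phi * z_{t-1} + (1 - phi) * target + w_t
    Observation equation (assumed form): o_t = W2 @ tanh(W1 @ z_t + b1) + b2
    """
    rng = np.random.default_rng(seed)
    z = np.asarray(z0, dtype=float)
    hidden, observed = [], []
    for _ in range(n_frames):
        # Recursive, autoregressive update of the hidden vector.
        z = phi * z + (1.0 - phi) * target + noise_std * rng.standard_normal(z.shape)
        # Neural-network mapping from the hidden vector to the observed vector.
        o = W2 @ np.tanh(W1 @ z + b1) + b2
        hidden.append(z)
        observed.append(o)
    return np.array(hidden), np.array(observed)

# Usage with arbitrary, randomly chosen network weights (3 hidden dims, 12 observed dims).
rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((8, 3)), np.zeros(8)
W2, b2 = rng.standard_normal((12, 8)), np.zeros(12)
target = np.array([0.5, 1.5, 2.5])
hidden, observed = simulate_hdm(0.9, target, W1, b1, W2, b2, n_frames=200, z0=np.zeros(3))
```

With the noise term set to zero, the hidden trajectory in this sketch moves smoothly from `z0` toward `target`, while the observed vectors are a nonlinear function of that trajectory. Parameter estimation and decoding, which the paper reviews, are not covered here.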
Similar resources
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Design of an Intelligent Controller for Station Keeping, Attitude Control, and Path Tracking of a Quadrotor Using Recursive Neural Networks
During recent years there has been growing interest in unmanned aerial vehicles (UAVs). Moreover, the necessity to control and navigate these vehicles has attracted much attention from researchers in this field. This is mostly due to the fact that the interactions between turbulent airflows apply complex aerodynamic forces to the system. Since the dynamics of a quadrotor are non-linear and the ...
Convolutional neural network with adaptive windows for speech recognition
Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
On the relationship between deterministic and probabilistic directed Graphical models: From Bayesian networks to recursive neural networks
Machine learning methods that can handle variable-size structured data such as sequences and graphs include Bayesian networks (BNs) and Recursive Neural Networks (RNNs). In both classes of models, the data is modeled using a set of observed and hidden variables associated with the nodes of a directed acyclic graph. In BNs, the conditional relationships between parent and child variables are pro...
Recurrent networks for structured data - A unifying approach and its properties
We consider recurrent neural networks which deal with symbolic formulas, terms, or, generally speaking, tree-structured data. Approaches like the recursive autoassociative memory, discrete-time recurrent networks, folding networks, tensor construction, holographic reduced representations, and recursive reduced descriptions fall into this category. They share the basic dynamics of how structured...